**Natural Language Processing Project**

The Emotional Side of Social Media Posts: Suicide Prediction based on Reddit Posts using Natural Language Processing Techniques

# Importing the necessary libraries
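The notebook's actual import cell is not shown; a minimal stand-in using only the standard library might look like this (the full project presumably also pulls in pandas, scikit-learn, and a deep-learning framework such as Keras):

```python
import re                      # regex-based text cleaning
import string                  # punctuation constants
from collections import Counter  # word-frequency counting
```
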

# Importing the dataset into a variable named data and viewing the top five records
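A sketch of this step, with a small in-memory CSV standing in for the real Reddit dataset (the actual file name and column names are assumptions):

```python
import io
import pandas as pd

# Hypothetical two-row sample; the real notebook presumably reads a CSV
# of Reddit posts with a text column and a class label.
csv = io.StringIO(
    "text,label\n"
    "I feel hopeless,suicide\n"
    "Great day today,non-suicide\n"
)
data = pd.read_csv(csv)
print(data.head())  # view the top five records
```
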

# Counting the number of posts belonging to each class.
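Class counts are typically obtained with `value_counts()`; the labels below are illustrative stand-ins for the dataset's class column:

```python
import pandas as pd

# Hypothetical labels standing in for the dataset's class column.
data = pd.DataFrame(
    {"label": ["suicide", "non-suicide", "suicide", "non-suicide", "suicide"]}
)
counts = data["label"].value_counts()  # one count per class
print(counts)
```
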

# Splitting the data into training and test sets in a 67%/33% ratio
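The split is most likely done with scikit-learn's `train_test_split`; a sketch with toy data (the `random_state` value is an assumption):

```python
from sklearn.model_selection import train_test_split

texts = ["post one", "post two", "post three",
         "post four", "post five", "post six"]
labels = [0, 1, 0, 1, 0, 1]

# test_size=0.33 reproduces the 67%/33% split described above.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.33, random_state=42
)
print(len(X_train), len(X_test))
```
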

# Cleaning the text by removing special characters and stopwords
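A minimal cleaning function along these lines (the stopword set here is a tiny stand-in; the original almost certainly uses NLTK's full English stopword list):

```python
import re

STOPWORDS = {"the", "a", "an", "is", "to", "and"}  # stand-in for NLTK's list

def clean_text(text):
    # Drop everything except letters and whitespace, then remove stopwords.
    text = re.sub(r"[^a-zA-Z\s]", " ", text.lower())
    return " ".join(w for w in text.split() if w not in STOPWORDS)

print(clean_text("I want to end it all!!!"))  # -> "i want end it all"
```
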

# Tokenization of the sentences/posts. Tokenization is essentially splitting a phrase, sentence, paragraph, or an entire text document into smaller units, such as individual words or terms. Each of these smaller units is called a token.
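A simple regex-based tokenizer illustrating the idea (the original likely uses `nltk.word_tokenize` or Keras's `Tokenizer` instead):

```python
import re

def tokenize(sentence):
    # Lowercase and extract alphabetic word tokens.
    return re.findall(r"[a-z']+", sentence.lower())

tokens = tokenize("I feel so alone tonight")
print(tokens)  # ['i', 'feel', 'so', 'alone', 'tonight']
```
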

# Frequency of words found in posts.
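Word frequencies can be computed with `collections.Counter`; the posts below are hypothetical examples:

```python
from collections import Counter

posts = ["i feel alone", "alone again tonight", "i feel fine"]

# Count every token across all posts.
freq = Counter(w for post in posts for w in post.split())
print(freq.most_common(3))
```
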

# WordClouds are visual representations of words that give greater prominence to words that appear more frequently.

# Sequence padding: pad_sequences is used to ensure that all sequences in a list have the same length.
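The notebook presumably calls Keras's `pad_sequences`; the stand-in below reproduces its default behaviour (`padding='pre'`, `truncating='pre'`) in plain Python to show what the operation does:

```python
def pad_sequences(seqs, maxlen, value=0):
    # Minimal stand-in for keras.preprocessing.sequence.pad_sequences
    # with the Keras defaults: pre-truncation and pre-padding.
    padded = []
    for seq in seqs:
        seq = seq[-maxlen:]  # keep at most the last maxlen items
        padded.append([value] * (maxlen - len(seq)) + seq)  # pad at the front
    return padded

print(pad_sequences([[5, 8], [3, 9, 2, 7]], maxlen=3))
# [[0, 5, 8], [9, 2, 7]]
```
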